Analysis of the knowledge about pain in students of Health Sciences

Objetives

  • To evaluate whether there are statistically significant differences in knowledge about pain among students of the Physiotherapy, Medicine, Nursing and Pharmacy degrees.
  • To evaluate whether there is statistically significant difference in knowledge about pain between students of the first and fourth year of the degrees.
  • To evaluate whether there is an interaction between the course and the degree in knowledge about pain.

Methodology

Cross-sectional observational study. The following questionnaires have been applied to the participating students:

  • RNPQ (Reliability and Pain Knowledge Questionnaire). Questionnaire with 12 questions with three possible answers (True, False, Don’t know). 1 point is scored for each correct answer and 0 points for each incorrect answer or “Don’t know”. The total score ranges from 0 to 12 points, with higher scores indicating greater knowledge about pain.

  • HC-PAIRS (Health care providers’ attitudes and beliefs about functional impairments and chronic back pain). Questionnaire with 15 questions with 7 possible answers on a liker scale (from Totally disagree to Totally agree). It is scored from 1 to 7 points. The total score ranges from 15 to 105, with lower scores indicating greater knowledge about pain.

# Packages loading.
library(tidyverse)
library(knitr)
library(broom)
library(vtable)
library(tidymodels)
library(ggpubr)
library(car)
library(rstatix)
library(emmeans)
library(patchwork)
# Data loading.
df <- read_csv("datos/datos-2024-11-25.csv") |>
    # Select the variables of interest.
    select(Degree, Course, Gender, RNPQ, HC_PAIRS, starts_with("X"), starts_with("Y")) |> 
    # Convert the categorical variables to factors.
    mutate(Degree = factor(Degree), Course = factor(Course), Gender = factor(Gender)) |> 
    # Create a new variable with the combination of the degree and the course.
    mutate(Degree_Course = interaction(Degree, Course))

# Show 10 random rows of the data frame.
sample_n(df, 10) |> 
    kable()
Degree Course Gender RNPQ HC_PAIRS X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 Y10 Y11 Y12 Y13 Y14 Y15 Degree_Course
physiotherapy first m 5 58 0 0 0 1 1 0 1 1 1 0 0 0 4 4 3 4 2 4 6 3 2 5 4 3 6 2 6 physiotherapy.first
nursing first f 8 64 0 0 1 1 1 1 1 1 0 1 1 0 4 5 4 4 5 3 5 6 6 5 4 4 4 5 4 nursing.first
medicine first f 5 63 0 0 1 0 1 0 0 1 0 1 1 0 5 7 4 3 7 1 3 2 4 7 4 4 7 4 1 medicine.first
nursing first f 6 80 0 0 1 0 1 0 1 1 0 0 1 1 2 7 6 7 6 1 1 5 7 7 3 1 7 4 6 nursing.first
medicine first f 6 58 0 0 1 0 1 0 1 1 1 0 1 0 6 7 4 4 4 3 5 6 5 5 4 3 4 6 2 medicine.first
pharmacy first f 3 70 0 0 0 0 1 0 0 1 0 1 0 0 6 7 6 4 5 4 3 7 6 7 4 5 5 6 1 pharmacy.first
medicine last m 11 62 1 0 1 1 1 1 1 1 1 1 1 1 4 6 3 6 4 3 7 5 5 7 4 2 6 4 2 medicine.last
nursing last f 11 58 0 1 1 1 1 1 1 1 1 1 1 1 7 3 5 2 5 2 4 6 5 5 2 3 5 1 3 nursing.last
nursing last m 5 71 1 0 0 0 0 0 1 1 0 0 1 1 5 6 6 5 4 3 5 7 5 6 3 6 6 4 4 nursing.last
pharmacy first f 6 63 0 0 0 0 0 1 1 1 1 0 1 1 3 5 6 3 4 3 4 5 6 4 4 5 4 4 1 pharmacy.first

Sample

Variables

# Show the structure of the data frame.
vt(df)
df
Name Class Values
Degree factor 'medicine' 'nursing' 'pharmacy' 'physiotherapy'
Course factor 'first' 'last'
Gender factor 'f' 'm'
RNPQ numeric Num: 0 to 12
HC_PAIRS numeric Num: 33 to 95
X1 numeric Num: 0 to 1
X2 numeric Num: 0 to 1
X3 numeric Num: 0 to 1
X4 numeric Num: 0 to 1
X5 numeric Num: 0 to 1
X6 numeric Num: 0 to 1
X7 numeric Num: 0 to 1
X8 numeric Num: 0 to 1
X9 numeric Num: 0 to 1
X10 numeric Num: 0 to 1
X11 numeric Num: 0 to 1
X12 numeric Num: 0 to 1
Y1 numeric Num: 1 to 7
Y2 numeric Num: 1 to 7
Y3 numeric Num: 1 to 7
Y4 numeric Num: 1 to 7
Y5 numeric Num: 1 to 7
Y6 numeric Num: 1 to 7
Y7 numeric Num: 1 to 7
Y8 numeric Num: 1 to 7
Y9 numeric Num: 1 to 7
Y10 numeric Num: 1 to 7
Y11 numeric Num: 1 to 7
Y12 numeric Num: 1 to 7
Y13 numeric Num: 1 to 7
Y14 numeric Num: 1 to 7
Y15 numeric Num: 1 to 7
Degree_Course factor 'medicine.first' 'nursing.first' 'pharmacy.first' 'physiotherapy.first' 'medicine.last' and 3 more

Sample size

The number of students who participated in the study was 715.

Sample size according to the degree.

df |> 
  count(Degree) |> 
  kable()
Degree n
medicine 156
nursing 175
pharmacy 130
physiotherapy 254

Sample size according to the course.

df |> 
  count(Course) |> 
  kable()
Course n
first 390
last 325

Sample size according to the degree and course.

table(df$Degree, df$Course) |> 
  kable()
first last
medicine 91 65
nursing 80 95
pharmacy 66 64
physiotherapy 153 101

Samples size according to the gender.

df |> 
  count(Gender) |> 
  kable()
Gender n
f 527
m 188

Descriptive analysis

Descriptive statistics

According to the degree

library(vtable)
st(df, vars = c("RNPQ", "HC_PAIRS"), group = "Degree", summ = c("mean(x)",   "sd(x)", "min(x)", "pctile(x)[25]", "median(x)", "pctile(x)[75]", "max(x)"), title = "Resumen descriptivo",  group.long = TRUE,  summ.names = c("Mean", "StdDev", "Min", "Q1", "Median", "Q3", "Max"), digits = 3)
Resumen descriptivo
Variable Mean StdDev Min Q1 Median Q3 Max
Degree: medicine
RNPQ 6.96 1.98 1 6 7 8 12
HC_PAIRS 62.7 9.31 38 57 62 69 88
Degree: nursing
RNPQ 7.09 1.81 2 6 7 8 12
HC_PAIRS 65.9 9.18 41 60 65 71 95
Degree: pharmacy
RNPQ 6.15 2.06 0 5 6 8 10
HC_PAIRS 67.4 8.13 51 62 67 73 92
Degree: physiotherapy
RNPQ 7.26 1.79 3 6 7 8 12
HC_PAIRS 59.4 9.62 33 53.2 59 66 88

According to the course

library(vtable)
st(df, vars = c("RNPQ", "HC_PAIRS"), group = "Course", summ = c("mean(x)",   "sd(x)", "min(x)", "pctile(x)[25]", "median(x)", "pctile(x)[75]", "max(x)"), title = "Resumen descriptivo",  group.long = TRUE,  summ.names = c("Mean", "StdDev", "Min", "Q1", "Median", "Q3", "Max"), digits = 3)
Resumen descriptivo
Variable Mean StdDev Min Q1 Median Q3 Max
Course: first
RNPQ 6.36 1.9 0 5 6 8 12
HC_PAIRS 64.4 8.62 39 59 64 70 95
Course: last
RNPQ 7.66 1.7 3 6 8 9 12
HC_PAIRS 61.7 10.7 33 55 61 68 92

According to the degree and course

library(vtable)
st(df, vars = c("RNPQ", "HC_PAIRS"), group = "Degree_Course", summ = c("mean(x)",   "sd(x)", "min(x)", "pctile(x)[25]", "median(x)", "pctile(x)[75]", "max(x)"), title = "Resumen descriptivo",  group.long = TRUE,  summ.names = c("Mean", "StdDev", "Min", "Q1", "Median", "Q3", "Max"), digits = 3)
Resumen descriptivo
Variable Mean StdDev Min Q1 Median Q3 Max
Degree_Course: medicine.first
RNPQ 6.08 1.84 1 5 6 7 12
HC_PAIRS 64 8.74 44 58 63 69 88
Degree_Course: nursing.first
RNPQ 7.14 1.85 2 6 7 8 12
HC_PAIRS 66 9.3 41 60.8 65 71 95
Degree_Course: pharmacy.first
RNPQ 5.2 2 0 4 5 6 10
HC_PAIRS 66.6 7.37 51 62 65.5 71 87
Degree_Course: physiotherapy.first
RNPQ 6.61 1.64 3 6 7 8 10
HC_PAIRS 62.8 8.41 39 57 62 68 88
Degree_Course: medicine.last
RNPQ 8.18 1.43 5 7 8 9 11
HC_PAIRS 60.8 9.82 38 55 60 68 83
Degree_Course: nursing.last
RNPQ 7.04 1.78 3 6 7 8 12
HC_PAIRS 65.8 9.13 46 60 65 71 90
Degree_Course: pharmacy.last
RNPQ 7.14 1.6 4 6 7 8 10
HC_PAIRS 68.2 8.83 51 62 67 73.2 92
Degree_Course: physiotherapy.last
RNPQ 8.25 1.55 5 7 8 9 12
HC_PAIRS 54.4 9.15 33 48 54 60 80

Boxplot

df |>
    ggplot(aes(x = Degree, y = RNPQ, fill = Degree)) +
    geom_boxplot() +
    labs(title = "Distribution of the RNPQ score by degree",
         x = "Degree",
         y = "Score") +
    theme_minimal()    

df |>
    ggplot(aes(x = Course, y = RNPQ, fill = Course)) +
    geom_boxplot() +
    labs(title = "Distribution of the RNPQ score by course",
         x = "Course",
         y = "Score") +
    theme_minimal()    

df |>
    ggplot(aes(x = Degree, y = RNPQ, fill = Degree)) +
    geom_boxplot() +
    labs(title = "Distribution of the RNPQ score by degree and course",
         x = "Degree",
         y = "Score") +
    facet_grid(. ~ Course) +
    theme_minimal()    

df |>
    ggplot(aes(x = Degree, y = HC_PAIRS, fill = Degree)) +
    geom_boxplot() +
    labs(title = "Distribution of the HC-PAIRS score by degree",
         x = "Degree",
         y = "Score") +
    theme_minimal()    

df |>
    ggplot(aes(x = Course, y = HC_PAIRS, fill = Course)) +
    geom_boxplot() +
    labs(title = "Distribution of the HC-PAIRS score by course",
         x = "Course",
         y = "Score") +
    theme_minimal()    

df |>
    ggplot(aes(x = Degree, y = HC_PAIRS, fill = Degree)) +
    geom_boxplot() +
    labs(title = "Distribution of the HC-PAIRS score by degree and course",
         x = "Degree",
         y = "Score") +
    facet_grid(. ~ Course) +
    theme_minimal()    

Barplot

df |> 
    ggplot(aes(x = RNPQ, fill = Course)) +
    geom_bar(aes(y = ..count../sum(..count..))) +
    labs(title = "Distribution of the RNPQ score by course",
         x = "RNPQ Score",
         y = "Frequency") +
    facet_grid(Course ~ .) +
    theme_minimal()

df |> 
    ggplot(aes(x = RNPQ, fill = Degree)) +
    geom_bar(aes(y = ..count../sum(..count..))) +
    labs(title = "Distribution of the RNPQ score by degree",
         x = "RNPQ Score",
         y = "Frequency") +
    facet_grid(Degree ~ .) +
    theme_minimal()

Histogram

df |> 
    ggplot(aes(x = HC_PAIRS, fill = Course)) +
    geom_histogram(aes(y = ..count../sum(..count..)), color = "white") +
    labs(title = "Distribution of the HC-PAIRS score by course",
         x = "HC-PAIRS score",
         y = "Frequency") +
    facet_grid(Course ~ .) +
    theme_minimal()

df |> 
    ggplot(aes(x = HC_PAIRS, fill = Degree)) +
    geom_histogram(aes(y = ..count../sum(..count..)), color = "white") +
    labs(title = "Distribution of the HC-PAIRS score by degree",
         x = "HC-PAIRS score",
         y = "Frequency") +
    facet_grid(Degree ~ .) +
    theme_minimal()

Means plot

ggline(df, x = "Course", y = "RNPQ", color = "Degree", add = c("mean_ci"), position = position_dodge(0.2), xlab = "Course", ylab = "RNPQ score", main = "Confidence intervals for the means of RNPQ score by degree and course", legend = "top")

Caution

It looks that the pain knowledge according to RNPQ score decreases in the last year of Nursing degree, that is very weird.

ggline(df, x = "Course", y = "HC_PAIRS", color = "Degree", add = c("mean_ci"), position = position_dodge(0.2), xlab = "Course", ylab = "HC-PAIRS score", main = "Confidence intervals for the means of HC-PAIRS score by degree and course", legend = "top")

Caution

It looks that the pain knowledge according to HC_PAIRS decreases in the last year of Pharmacy degree, that is very weird.

Comparison of RNPQ scores

As the dependent variable si quantitative (discrete) and the dependent variables are qualitative, we perform a two-factors ANOVA test for comparing the means of RNPQ score for the groups based on degree and course.

ANOVA

anova <- aov(RNPQ ~ Degree * Course, data = df)
# Use the type III sum of squares as the groups are unbalanced.
Anova(anova, type = "III") |>
    tidy() |>
    mutate(p.value = format(p.value, scientific = TRUE, digits = 2)) |>
    kable()
term sumsq df statistic p.value
(Intercept) 3360.5385 1 1145.12317 5.4e-150
Degree 154.8222 3 17.58552 5.1e-11
Course 168.4397 1 57.39683 1.1e-13
Degree:Course 128.7910 3 14.62876 3.0e-09
Residuals 2074.7992 707 NA NA

The ANOVA test shows both, the degree and the course, have a statistically significant effect on the scores of the RNPQ questionnaire, and also the interaction between the degree and the course has a statistically significant effect (p-value < 0.01).

ANOVA assumptions

Now we study the residuals of the model to check the ANOVA assumptions are meet.

# Add residuals and fitted scores to the data frame.
df <- df |> mutate(
    predictions = fitted(anova),
    residuals = residuals(anova)
)

First we check normality of residuals.

par(mfrow = c(1, 2))  
hist(df$residuals)
qqnorm(df$residuals, main = "Q-Q Plot of Residuals")
qqline(df$residuals, col = "red", lwd = 2)

According to the charts there is no significant departure from normality.

After we check homocedasticity.

leveneTest(RNPQ ~ Degree * Course, data = df) |> 
    tidy() |> 
    kable()
statistic p.value df df.residual
0.9173686 0.4921935 7 707

The variances of the scores of the RNPQ questionnaire are homogeneous in all the degree-course groups (p-value > 0.05).

Pairwise comparison of means

As the interaction of degree and course is significant, we estimate the mean for each degree-course group.

marginal_means <- emmeans(anova, ~ Degree * Course)
marginal_means |> 
    tidy(conf.int = TRUE) |> 
    select(-statistic, -p.value) |> 
    arrange(desc(estimate)) |> 
    kable()
Degree Course estimate std.error df conf.low conf.high
physiotherapy last 8.247525 0.1704581 707 7.912860 8.582189
medicine last 8.184615 0.2124818 707 7.767445 8.601786
pharmacy last 7.140625 0.2141353 707 6.720208 7.561042
nursing first 7.137500 0.1915285 707 6.761467 7.513533
nursing last 7.042105 0.1757586 707 6.697034 7.387176
physiotherapy first 6.614379 0.1384945 707 6.342469 6.886289
medicine first 6.076923 0.1795799 707 5.724349 6.429497
pharmacy first 5.196970 0.2108659 707 4.782971 5.610968
marginal_means |> 
    as_tibble() |> 
    mutate(DegreeCourse = paste(Degree, Course, sep = "-")) |> 
    mutate(DegreeCourse = reorder(DegreeCourse, emmean)) |> 
    ggplot(aes(x = emmean, y = DegreeCourse)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    labs(title = "Estimated means of RNPQ score",
       x = "Estimated Mean",
       y = "Degree - Course") +
    theme_minimal()

Now we compare the means of all the groups by pairs.

marginal_means |>
    pairs(infer = T) |>
    as_tibble() |> 
    arrange(p.value) |> 
    kable()
contrast estimate SE df lower.CL upper.CL t.ratio p.value
medicine first - medicine last -2.1076923 0.2782039 707 -2.9534371 -1.2619475 -7.5760695 0.0000000
medicine first - physiotherapy last -2.1706017 0.2475982 707 -2.9233046 -1.4178987 -8.7666278 0.0000000
nursing first - pharmacy first 1.9405303 0.2848642 707 1.0745383 2.8065223 6.8121253 0.0000000
pharmacy first - medicine last -2.9876457 0.2993542 707 -3.8976876 -2.0776038 -9.9803037 0.0000000
pharmacy first - nursing last -1.8451356 0.2745096 707 -2.6796494 -1.0106217 -6.7215708 0.0000000
pharmacy first - physiotherapy last -3.0505551 0.2711464 707 -3.8748449 -2.2262652 -11.2505810 0.0000000
physiotherapy first - physiotherapy last -1.6331457 0.2196285 707 -2.3008203 -0.9654711 -7.4359444 0.0000000
pharmacy first - pharmacy last -1.9436553 0.3005302 707 -2.8572722 -1.0300384 -6.4674219 0.0000000
physiotherapy first - medicine last -1.5702363 0.2536321 707 -2.3412822 -0.7991904 -6.1910003 0.0000000
pharmacy first - physiotherapy first -1.4174094 0.2522799 707 -2.1843448 -0.6504740 -5.6183992 0.0000008
nursing last - physiotherapy last -1.2054195 0.2448409 707 -1.9497399 -0.4610991 -4.9232776 0.0000290
nursing first - physiotherapy last -1.1100248 0.2563964 707 -1.8894743 -0.3305752 -4.3293303 0.0004528
medicine last - nursing last 1.1425101 0.2757527 707 0.3042170 1.9808032 4.1432412 0.0009976
pharmacy last - physiotherapy last -1.1068998 0.2736967 707 -1.9389426 -0.2748569 -4.0442562 0.0014970
medicine first - nursing first -1.0605769 0.2625492 707 -1.8587310 -0.2624228 -4.0395359 0.0015259
medicine first - nursing last -0.9651822 0.2512768 707 -1.7290679 -0.2012965 -3.8411122 0.0033346
medicine first - pharmacy last -1.0637019 0.2794689 707 -1.9132922 -0.2141116 -3.8061548 0.0038104
nursing first - medicine last -1.0471154 0.2860623 707 -1.9167498 -0.1774810 -3.6604449 0.0065493
medicine last - pharmacy last 1.0439904 0.3016661 707 0.1269202 1.9610606 3.4607480 0.0132387
medicine first - pharmacy first 0.8799534 0.2769718 707 0.0379544 1.7219523 3.1770509 0.0332132
medicine first - physiotherapy first -0.5374560 0.2267811 707 -1.2268744 0.1519624 -2.3699331 0.2578662
nursing first - physiotherapy first 0.5231209 0.2363554 707 -0.1954037 1.2416455 2.2132807 0.3448981
physiotherapy first - pharmacy last -0.5262459 0.2550190 707 -1.3015080 0.2490162 -2.0635559 0.4398056
physiotherapy first - nursing last -0.4277262 0.2237673 707 -1.1079827 0.2525303 -1.9114774 0.5434004
nursing first - nursing last 0.0953947 0.2599504 707 -0.6948591 0.8856486 0.3669728 0.9999577
nursing last - pharmacy last -0.0985197 0.2770289 707 -0.9406924 0.7436530 -0.3556298 0.9999658
medicine last - physiotherapy last -0.0629094 0.2724050 707 -0.8910252 0.7652064 -0.2309406 0.9999982
nursing first - pharmacy last -0.0031250 0.2872927 707 -0.8764998 0.8702498 -0.0108774 1.0000000

As many of this comparisons make no sense, we perform a pairwise comparison of degrees by course.

marginal_means|> 
    pairs(infer = T, by = "Course") |>   
    as_tibble() |> 
    arrange(Course, p.value) |> 
    kable()
contrast Course estimate SE df lower.CL upper.CL t.ratio p.value
nursing - pharmacy first 1.9405303 0.2848642 707 1.2069620 2.6740986 6.8121253 0.0000000
pharmacy - physiotherapy first -1.4174094 0.2522799 707 -2.0670684 -0.7677503 -5.6183992 0.0000002
medicine - nursing first -1.0605769 0.2625492 707 -1.7366809 -0.3844730 -4.0395359 0.0003456
medicine - pharmacy first 0.8799534 0.2769718 707 0.1667091 1.5931976 3.1770509 0.0084288
medicine - physiotherapy first -0.5374560 0.2267811 707 -1.1214517 0.0465396 -2.3699331 0.0838778
nursing - physiotherapy first 0.5231209 0.2363554 707 -0.0855301 1.1317720 2.2132807 0.1205894
nursing - physiotherapy last -1.2054195 0.2448409 707 -1.8359218 -0.5749172 -4.9232776 0.0000063
medicine - nursing last 1.1425101 0.2757527 707 0.4324050 1.8526152 4.1432412 0.0002243
pharmacy - physiotherapy last -1.1068998 0.2736967 707 -1.8117103 -0.4020892 -4.0442562 0.0003389
medicine - pharmacy last 1.0439904 0.3016661 707 0.2671545 1.8208263 3.4607480 0.0031964
nursing - pharmacy last -0.0985197 0.2770289 707 -0.8119112 0.6148717 -0.3556298 0.9845765
medicine - physiotherapy last -0.0629094 0.2724050 707 -0.7643934 0.6385747 -0.2309406 0.9956593
pwpp(marginal_means, by = "Course") +
    labs(title = "P-values of pairwise comparisons of the RNPQ means by course")

plot(marginal_means, by = "Course", comparisons = TRUE) +
    labs(title = "95% confidence intervals for the estimated means\nof RNPQ scores and pairwise comparisons", x = "Estimated means") +
    theme_minimal()

marginal_means|> 
    pairs(infer = T, by = "Course") |>
    as_tibble() |> 
    filter(Course == "first") |> 
    mutate(contrast = reorder(contrast, estimate)) |> 
    ggplot(aes(x = estimate, y = contrast)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    labs(title = "95% confidence intervals for the difference between the RNPQ means in the last course",
       y = "Degrees") +
    theme_minimal()

In the first course there are significant differences between nursing and pharmacy, pharmacy and physioterapy, medicine and nursing and medicine and pharmacy.

marginal_means|> 
    pairs(infer = T, by = "Course") |>
    as_tibble() |> 
    filter(Course == "last") |> 
    mutate(contrast = reorder(contrast, estimate)) |> 
    ggplot(aes(x = estimate, y = contrast)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    labs(title = "95% confidence intervals for the difference between the RNPQ means in the last course",
       x = "Estimated Mean difference",
       y = "Degrees") +
    theme_minimal()

In the last course, there are significant differences between nursing and physiotherapy, medicine and nursing, pharmacy and physiotherapy and medicine and nursing.

marginal_means|> 
    eff_size(sigma = sigma(anova), edf = 708, by = "Course") |> 
    as_tibble() |>
    arrange(Course, desc(abs(effect.size))) |>
    kable()
contrast Course effect.size SE df lower.CL upper.CL
nursing - pharmacy first 1.1327709 0.1689903 707 0.8009881 1.4645537
pharmacy - physiotherapy first -0.8274028 0.1488991 707 -1.1197401 -0.5350655
medicine - nursing first -0.6191043 0.1541418 707 -0.9217348 -0.3164739
medicine - pharmacy first 0.5136666 0.1622555 707 0.1951062 0.8322270
medicine - physiotherapy first -0.3137362 0.1326442 707 -0.5741598 -0.0533126
nursing - physiotherapy first 0.3053682 0.1382093 707 0.0340184 0.5767179
nursing - physiotherapy last -0.7036552 0.1441422 707 -0.9866532 -0.4206572
medicine - nursing last 0.6669323 0.1619415 707 0.3489884 0.9848761
pharmacy - physiotherapy last -0.6461450 0.1606886 707 -0.9616290 -0.3306609
medicine - pharmacy last 0.6094221 0.1768386 707 0.2622303 0.9566138
nursing - pharmacy last -0.0575102 0.1617209 707 -0.3750209 0.2600005
medicine - physiotherapy last -0.0367229 0.1590175 707 -0.3489259 0.2754801

The biggest effect size detected was between Physiotherapy and Nursing, followed by Medicine and Nursing, Physiotherapy and Pharmacy and Medicine and Pharmacy, all of them with an effect size d>0.5.

In the same way, we compare the courses for each degree.

marginal_means |> 
    pairs(infer = T, by = "Degree") |> 
    as_tibble() |> 
    arrange(p.value) |> 
    kable()
contrast Degree estimate SE df lower.CL upper.CL t.ratio p.value
first - last medicine -2.1076923 0.2782039 707 -2.6538970 -1.561488 -7.5760695 0.0000000
first - last physiotherapy -1.6331457 0.2196285 707 -2.0643479 -1.201943 -7.4359444 0.0000000
first - last pharmacy -1.9436553 0.3005302 707 -2.5336937 -1.353617 -6.4674219 0.0000000
first - last nursing 0.0953947 0.2599504 707 -0.4149725 0.605762 0.3669728 0.7137491
pwpp(marginal_means, by = "Degree") +
    labs(title = "P-values of comparisons of the RNPQ means\nof first and last courses by dregee")

plot(marginal_means, by = "Degree", comparisons = TRUE) +
    labs(title = "95% confidence intervals for the estimated means\nof RNPQ scores and pairwise comparisons", x = "Estimated means") +
    theme_minimal()

marginal_means|> 
    pairs(infer = T, by = "Degree") |>
    as_tibble() |> 
    mutate(Degree = reorder(Degree, estimate)) |> 
    ggplot(aes(x = estimate, y = Degree)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    labs(title = "Difference between the first and last courses means by degrees",
       x = "Estimated Mean difference",
       y = "Degrees") +
    theme_minimal()

There are significant differences between the first and the last courses in medicine, pharmacy and physiotherapy, but not in nursing.

marginal_means|> 
    eff_size(sigma = sigma(anova), edf = 708, by = "Degree") |> 
    as_tibble() |>
    arrange(desc(abs(effect.size))) |>
    kable()
contrast Degree effect.size SE df lower.CL upper.CL
first - last medicine -1.2303506 0.1656583 707 -1.5555916 -0.9051095
first - last pharmacy -1.1345951 0.1780046 707 -1.4840760 -0.7851143
first - last physiotherapy -0.9533373 0.1306858 707 -1.2099160 -0.6967586
first - last nursing 0.0556860 0.1517515 707 -0.2422514 0.3536234

Comparison of HC-PAIRS scores

ANOVA

anova <- aov(HC_PAIRS ~ Degree * Course, data = df)

anova |>
    tidy() |>
    kable()
term df sumsq meansq statistic p.value
Degree 3 7179.880 2393.29340 30.61140 0e+00
Course 1 1997.783 1997.78262 25.55262 5e-07
Degree:Course 3 2768.690 922.89666 11.80430 1e-07
Residuals 707 55275.443 78.18309 NA NA

The ANOVA test shows both, the degree and the course, have a statistically significant effect on the scores of the HC-PAIRS questionnaire, and also the interaction between the degree and the course has a statistically significant effect (p-value < 0.01).

ANOVA assumptions

Now we study the residuals of the model to check the ANOVA assumptions are meet.

# Add residuals and fitted scores to the data frame.
df <- df |> mutate(
    predictions = fitted(anova),
    residuals = residuals(anova)
)

First we check normality of residuals.

par(mfrow = c(1, 2))  
hist(df$residuals)
qqnorm(df$residuals, main = "Q-Q Plot of Residuals")
qqline(df$residuals, col = "red", lwd = 2)

According to the charts there is no significant departure from normality.

After we check homocedasticity.

leveneTest(HC_PAIRS ~ Degree * Course, data = df) |> 
    tidy() |> 
    kable()
statistic p.value df df.residual
0.7224541 0.6530113 7 707

The variances of the scores of the HC_PAIRS questionnaire are homogeneous in all the degree-course groups (p-value > 0.05).

par(mfrow = c(1,1))
plot(anova, which=1)

The scatter plot of residuals vs predicted scores shows homogeneity of variances.

Pairwise comparison of means

As the interaction of degree and course is significant, we estimate the mean for each degree-course group.

marginal_means <- emmeans(anova, specs =  ~ Degree:Course, weights = "cells")
marginal_means |> 
    tidy(conf.int = TRUE) |> 
    select(-statistic, -p.value) |> 
    arrange(desc(estimate)) |> 
    kable()
Degree Course estimate std.error df conf.low conf.high
pharmacy last 68.20312 1.1052650 707 66.03313 70.37312
pharmacy first 66.63636 1.0883897 707 64.49950 68.77323
nursing first 66.00000 0.9885791 707 64.05910 67.94090
nursing last 65.78947 0.9071824 707 64.00838 67.57057
medicine first 64.01099 0.9269060 707 62.19117 65.83081
physiotherapy first 62.79085 0.7148430 707 61.38738 64.19432
medicine last 60.81538 1.0967300 707 58.66215 62.96862
physiotherapy last 54.38614 0.8798238 707 52.65876 56.11352
marginal_means |> 
    as_tibble() |> 
    mutate(DegreeCourse = paste(Degree, Course, sep = "-")) |> 
    mutate(DegreeCourse = reorder(DegreeCourse, emmean)) |> 
    ggplot(aes(x = emmean, y = DegreeCourse)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    labs(title = "Estimated means of HC_PAIRS score",
       x = "Estimated Mean",
       y = "Degree - Course") +
    theme_minimal()

Now we compare the means of all the groups by pairs.

contrast(marginal_means, method = "pairwise") |>
    as_tibble() |> 
    arrange(p.value) |> 
    kable()
contrast estimate SE df t.ratio p.value
medicine first - physiotherapy last 9.6248504 1.277985 707 7.5312723 0.0000000
nursing first - physiotherapy last 11.6138614 1.323397 707 8.7757981 0.0000000
pharmacy first - physiotherapy last 12.2502250 1.399529 707 8.7531038 0.0000000
physiotherapy first - physiotherapy last 8.4047111 1.133618 707 7.4140570 0.0000000
nursing last - physiotherapy last 11.4033351 1.263752 707 9.0233942 0.0000000
pharmacy last - physiotherapy last 13.8169864 1.412693 707 9.7806028 0.0000000
medicine last - pharmacy last -7.3877404 1.557057 707 -4.7446811 0.0000687
medicine last - physiotherapy last 6.4292460 1.406025 707 4.5726394 0.0001531
physiotherapy first - pharmacy last -5.4122753 1.316287 707 -4.1117747 0.0011362
pharmacy first - medicine last 5.8209790 1.545124 707 3.7673210 0.0044121
nursing first - medicine last 5.1846154 1.476518 707 3.5113798 0.0111229
medicine last - nursing last -4.9740891 1.423305 707 -3.4947462 0.0117816
pharmacy first - physiotherapy first 3.8455140 1.302149 707 2.9532051 0.0639970
medicine first - pharmacy last -4.1921360 1.442486 707 -2.9061886 0.0728516
nursing first - physiotherapy first 3.2091503 1.219955 707 2.6305490 0.1467208
physiotherapy first - nursing last -2.9986240 1.154981 707 -2.5962546 0.1589053
medicine first - medicine last 3.1956044 1.435957 707 2.2254184 0.3376603
medicine first - pharmacy first -2.6253746 1.429597 707 -1.8364441 0.5952845
nursing last - pharmacy last -2.4136513 1.429892 707 -1.6879958 0.6951533
physiotherapy first - medicine last 1.9754651 1.309128 707 1.5089925 0.8026429
nursing first - pharmacy last -2.2031250 1.482869 707 -1.4857183 0.8150892
medicine first - nursing first -1.9890110 1.355154 707 -1.4677376 0.8244253
medicine first - nursing last -1.7784847 1.296971 707 -1.3712598 0.8701161
medicine first - physiotherapy first 1.2201393 1.170536 707 1.0423763 0.9677587
pharmacy first - pharmacy last -1.5667614 1.551194 707 -1.0100357 0.9729410
pharmacy first - nursing last 0.8468900 1.416888 707 0.5977112 0.9989049
nursing first - pharmacy first -0.6363636 1.470334 707 -0.4328022 0.9998708
nursing first - nursing last 0.2105263 1.341741 707 0.1569054 0.9999999

As many of this comparisons make no sense, we perform a pairwise comparison of degrees by course.

emmeans(anova, ~ Degree | Course, weights = "cells") |> 
    pairs() |> 
    as_tibble() |> 
    arrange(Course, p.value) |> 
    kable()
contrast Course estimate SE df t.ratio p.value
pharmacy - physiotherapy first 3.8455140 1.302149 707 2.9532051 0.0170911
nursing - physiotherapy first 3.2091503 1.219955 707 2.6305490 0.0431372
medicine - pharmacy first -2.6253746 1.429597 707 -1.8364441 0.2570781
medicine - nursing first -1.9890110 1.355154 707 -1.4677376 0.4576519
medicine - physiotherapy first 1.2201393 1.170536 707 1.0423763 0.7245492
nursing - pharmacy first -0.6363636 1.470334 707 -0.4328022 0.9728210
nursing - physiotherapy last 11.4033351 1.263752 707 9.0233942 0.0000000
pharmacy - physiotherapy last 13.8169864 1.412693 707 9.7806028 0.0000000
medicine - pharmacy last -7.3877404 1.557057 707 -4.7446811 0.0000150
medicine - physiotherapy last 6.4292460 1.406025 707 4.5726394 0.0000336
medicine - nursing last -4.9740891 1.423305 707 -3.4947462 0.0028304
nursing - pharmacy last -2.4136513 1.429892 707 -1.6879958 0.3307542
emmeans(anova, ~ Degree | Course) |> 
    pairs() |> 
    confint() |> 
    as_tibble() |> 
    filter(Course == "first") |> 
    mutate(contrast = reorder(contrast, estimate)) |> 
    ggplot(aes(x = estimate, y = contrast)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    labs(title = "Difference of HC_PAIRS means in the first course",
       x = "Estimated Mean difference",
       y = "Degrees") +
    theme_minimal()

In the first course there are significant differences between pharmacy and physiotherapy and between nursing and physioterapy.

emmeans(anova, ~ Degree | Course) |> 
    pairs() |> 
    confint() |> 
    as_tibble() |> 
    filter(Course == "last") |> 
    mutate(contrast = reorder(contrast, estimate)) |> 
    ggplot(aes(x = estimate, y = contrast)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    labs(title = "Difference of HC_PAIRS means in the last course",
       x = "Estimated Mean difference",
       y = "Degrees") +
    theme_minimal()

In the last course, there are significant differences between pharmacy and physiotherapy, nursing and physiotherapy, medicine and pharmacy, medicine and physiotherapy and nursing and pharmacy.

In the same way, we compare the courses for each degree.

contrast(marginal_means, method = "pairwise", by = "Degree", adjust = "tukey")
Degree = medicine:
 contrast     estimate   SE  df t.ratio p.value
 first - last    3.196 1.44 707   2.225  0.0264

Degree = nursing:
 contrast     estimate   SE  df t.ratio p.value
 first - last    0.211 1.34 707   0.157  0.8754

Degree = pharmacy:
 contrast     estimate   SE  df t.ratio p.value
 first - last   -1.567 1.55 707  -1.010  0.3128

Degree = physiotherapy:
 contrast     estimate   SE  df t.ratio p.value
 first - last    8.405 1.13 707   7.414  <.0001
emmeans(anova, ~ Course | Degree, weights = "cells") |> 
    pairs() |> 
    as_tibble() |> 
    arrange(p.value) |> 
    kable()
contrast Degree estimate SE df t.ratio p.value
first - last physiotherapy 8.4047111 1.133618 707 7.4140570 0.0000000
first - last medicine 3.1956044 1.435957 707 2.2254184 0.0263678
first - last pharmacy -1.5667614 1.551194 707 -1.0100357 0.3128237
first - last nursing 0.2105263 1.341741 707 0.1569054 0.8753642
emmeans(anova, ~ Course | Degree) |> 
    pairs() |> 
    confint() |> 
    as_tibble() |> 
    mutate(Degree = reorder(Degree, estimate)) |> 
    ggplot(aes(x = estimate, y = Degree)) +
    geom_errorbar(aes(xmin = lower.CL, xmax = upper.CL), size = 1, width = 0.2, color = myblue) +
    geom_point(size = 3, color = myred) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    labs(title = "Difference between the first and last courses means by degrees",
       x = "Estimated Mean difference",
       y = "Degrees") +
    theme_minimal()

There are significant differences between the first and the last courses in pysiotherapy and medicien, but not in nursing and pharmacy.